Adaptive Distance Metrics for Nearest Neighbour Classification Based on Genetic Programming
Authors
Abstract
Nearest Neighbour (NN) classification is a widely used and effective method for both binary and multi-class problems. It relies on the assumption that class conditional probabilities are locally constant. In high dimensions this assumption breaks down and severe bias can be introduced, which degrades the performance of the method. A locally adaptive distance metric then becomes crucial for keeping class conditional probabilities approximately uniform, so that better classification performance can be attained. This paper presents a locally adaptive distance metric for NN classification based on a supervised learning algorithm (Genetic Programming) that learns a vector of feature weights for the features composing a query instance. Used within a weighted Euclidean distance metric, these weights adapt the neighbourhood shape to the query location, stretching the neighbourhood along the directions in which the class conditional probabilities change least. Initial empirical results on a set of real-world classification datasets show that the proposed method improves the generalisation performance of the standard NN algorithm and is competitive with other learning algorithms for pattern classification.
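For illustration only, the sketch below shows a weighted Euclidean metric and the resulting weighted k-NN vote of the kind the abstract describes. It assumes a per-feature weight vector w has already been produced by some learner (in the paper, the Genetic Programming component, which is not reproduced here); the function and variable names are hypothetical, not the authors' implementation.

```python
import numpy as np

def weighted_euclidean(x, q, w):
    """Weighted Euclidean distance between a stored instance x and a query q.

    w is a non-negative per-feature weight vector: small weights stretch the
    neighbourhood along features whose class conditional probabilities change
    little, while large weights shrink it along more discriminative features.
    """
    diff = x - q
    return np.sqrt(np.sum(w * diff * diff))

def weighted_knn_predict(X_train, y_train, q, w, k=1):
    """Majority-vote k-NN prediction for query q under the weighted metric."""
    dists = np.sqrt((w * (X_train - q) ** 2).sum(axis=1))  # distance to every training point
    nearest = np.argsort(dists)[:k]                        # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy usage: two features, where only the first one is informative for the class.
X = np.array([[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [0.9, -2.0]])
y = np.array([0, 0, 1, 1])
w = np.array([1.0, 0.01])  # down-weight the uninformative second feature
print(weighted_knn_predict(X, y, np.array([0.95, 5.0]), w, k=3))  # -> 1
```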
Similar Resources
An efficient weighted nearest neighbour classifier using vertical data representation
The k-nearest neighbour (KNN) technique is a simple yet effective method for classification. In this paper, we propose an efficient weighted nearest neighbour classification algorithm, called PINE, using vertical data representation. A metric called HOBBit is used as the distance metric. The PINE algorithm applies a Gaussian podium function to assign weights to different neighbours. We compare PIN...
Nearest Neighbour Distance Matrix Classification
Distance-based classification is one of the popular methods for classifying instances using a point-to-point distance based on the nearest neighbour or k-nearest neighbour (k-NN). The distance measure used can be any of the various measures available (e.g. Euclidean distance, Manhattan distance, Mahalanobis distance or other specific distance measures). In this paper, we propose ...
Efficient Nearest-neighbour Searches Using Weighted Euclidean Metrics
Building an index tree is a common approach to speed up the k nearest neighbour search in large databases of many-dimensional records. Many applications require varying distance metrics by putting a weight on different dimensions. The main problem with k nearest neighbour searches using weighted Euclidean metrics in a high-dimensional space is whether the searches can be done efficiently. We presen...
Hesitant Fuzzy k-Nearest Neighbour (HFK-NN) Classifier for Document Classification and Numerical Result Analysis
This paper presents a new approach to document classification and numerical result analysis based on the Hesitant Fuzzy K-nearest neighbour (HFK-NN) classifier. The proposed HFK-NN classification approach is based on hesitant fuzzy distance, and we use hesitant fuzzy distance calculations to obtain the document classification results. The following steps are used for classif...
Improving k-Nearest Neighbour Classification with Distance Functions Based on Receiver Operating Characteristics
The k-nearest neighbour (k-NN) technique, due to its interpretable nature, is a simple and very intuitively appealing method to address classification problems. However, choosing an appropriate distance function for k-NN can be challenging and an inferior choice can make the classifier highly vulnerable to noise in the data. In this paper, we propose a new method for determining a good distance...
Journal title:
Volume Issue:
Pages: -
Publication date: 2013